53 research outputs found

    Semantic Video Indexing Using MPEG Motion Vectors

    With the diffusion of large video databases and "electronic program guides", the problem of semantic video indexing is of great interest. In the literature we can find many video indexing algorithms based on various types of low-level features, but the problem of semantic video indexing is less studied and is certainly a very challenging one. In this paper we present a semantic video indexing algorithm based on the motion information extracted from the MPEG compressed bit-stream. This algorithm is an example of a solution to the problem of finding a semantic event (the scoring of a goal) in a specific type of sequence (soccer video).

    Image Retrieval with Random Bubbles

    In this work we propose an algorithm for content-based image retrieval based on the random selection of circular bubbles on the reference image. More specifically, an image fingerprint vector is extracted from the image, the components of which are simple statistical parameters associated with the luminance values in some selected circular areas of the image. The positions and radii of these bubbles result from a random selection, with characteristics defined by the user. In this way, the extracted fingerprint is very robust with respect to linear and nonlinear distortion of the image. Experiments based on the detection of various linearly and nonlinearly distorted versions of a test image in a large database have shown very promising results.
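As a rough illustration of the fingerprint idea in this abstract, the sketch below places circular bubbles at random positions and collects simple luminance statistics from each. It is a minimal sketch under stated assumptions, not the authors' algorithm: the choice of mean and variance as statistics, the function names `bubble_fingerprint` and `fingerprint_distance`, the parameter defaults, and the Euclidean distance are all hypothetical. The image is assumed to be a list of rows of luminance values, at least `2 * max_r + 1` pixels on each side.

```python
import random

def bubble_fingerprint(image, n_bubbles=8, min_r=1, max_r=3, seed=0):
    """Return [mean, variance] of luminance for each randomly placed bubble.

    Assumption: mean and variance stand in for the "simple statistical
    parameters" mentioned in the abstract.
    """
    rng = random.Random(seed)  # same seed -> same bubbles for every image
    height, width = len(image), len(image[0])
    fingerprint = []
    for _ in range(n_bubbles):
        r = rng.randint(min_r, max_r)
        # Keep the whole bubble inside the image.
        cy = rng.randint(r, height - 1 - r)
        cx = rng.randint(r, width - 1 - r)
        pixels = [
            image[y][x]
            for y in range(cy - r, cy + r + 1)
            for x in range(cx - r, cx + r + 1)
            if (y - cy) ** 2 + (x - cx) ** 2 <= r * r
        ]
        mean = sum(pixels) / len(pixels)
        var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
        fingerprint.extend([mean, var])
    return fingerprint

def fingerprint_distance(a, b):
    """Euclidean distance between two fingerprints (hypothetical choice)."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5
```

Because the bubbles are drawn from a seeded generator, a query image and every database image are sampled at the same positions, so their fingerprints can be compared component by component.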

    Hierarchical Structuring of Video Previews by Leading-Cluster-Analysis

    Clustering of shots is frequently used for accessing video data and enabling quick grasping of the associated content. In this work we first group video shots by a classic hierarchical algorithm, where shot content is described by a codebook of visual words and different codebooks are compared by a suitable measure of distortion. To deal with the high number of levels in a hierarchical tree, a novel procedure of Leading-Cluster-Analysis is then proposed to extract a reduced set of hierarchically arranged previews. The depth of the obtained structure is driven both by the nature of the visual content and by the needs of the user, who can navigate the obtained video previews at various levels of representation. The effectiveness of the proposed method is demonstrated by extensive tests and comparisons carried out on a large collection of video data.
    [...] of digital videos has not been accompanied by a parallel increase in its accessibility. In this context, video abstraction techniques may represent a key component of a practical video management system: indeed, a condensed video may be effective for quick browsing or retrieval tasks. A commonly accepted type of abstract for generic videos does not exist yet, and the solutions investigated so far usually depend on the nature and the genre of the video data.
    Benini, Sergio; Migliorati, Pierangelo; Leonardi, Riccardo

    An Overview of Multimodal Techniques for the Characterization of Sport Programmes

    The problem of content characterization of sports videos is of great interest because sports video appeals to large audiences and its efficient distribution over various networks should contribute to widespread usage of multimedia services. In this paper we analyze several techniques proposed in the literature for content characterization of sports videos. We focus this analysis on the type of signal (audio, video, text captions, ...) from which the low-level features are extracted. First we consider the techniques based on visual information, then the methods based on audio information, and finally the algorithms based on audio-visual cues, used in a multi-modal fashion. This analysis shows that each type of signal carries some specific information, and that the multi-modal approach can fully exploit the multimedia information associated with the sports video. Moreover, we observe that the characterization is performed either by considering what happens in a specific time segment, thus observing the features in a "static" way, or by trying to capture their "dynamic" evolution in time. The effectiveness of each approach depends mainly on the kind of sport it relates to, and on the type of highlights we are focusing on.

    Audio Classification in Speech and Music: A Comparison between a Statistical and a Neural Approach

    We focus on the problem of audio classification into speech and music for multimedia applications. In particular, we present a comparison between two different techniques for speech/music discrimination. The first method is based on zero-crossing rate and Bayesian classification. It is very simple from a computational point of view, and gives good results in the case of pure music or speech. The simulation results show that some performance degradation arises when the music segment also contains speech superimposed on music, or strong rhythmic components. To overcome these problems, we propose a second method that uses more features and is based on neural networks (specifically a multi-layer perceptron). In this case we obtain better performance, at the expense of a limited growth in computational complexity. In practice, the proposed neural network is simple to implement if a suitable polynomial is used as the activation function, and a real-time implementation is possible even on low-cost embedded systems.
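The zero-crossing rate named in this abstract can be sketched as follows. This is a minimal sketch assuming a common definition (the fraction of adjacent sample pairs whose signs differ, with zero treated as non-negative); the abstract does not say which ZCR-derived statistics feed the Bayesian classifier, so only the raw per-frame rate is shown. The function names `zero_crossing_rate` and `frame_zcr` and the use of non-overlapping frames are assumptions.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for prev, cur in zip(frame, frame[1:]) if (prev >= 0) != (cur >= 0)
    )
    return crossings / (len(frame) - 1)

def frame_zcr(signal, frame_len):
    """ZCR of each non-overlapping frame of length frame_len.

    Trailing samples that do not fill a whole frame are ignored.
    """
    return [
        zero_crossing_rate(signal[i:i + frame_len])
        for i in range(0, len(signal) - frame_len + 1, frame_len)
    ]
```

A classifier would then operate on the sequence returned by `frame_zcr`, for example on its mean or variance over a longer window; that step is left out here because the source does not specify it.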

    Video Coding with Motion Estimation at the Decoder

    Predictive video coding is based on motion estimation. In such systems the temporal correlation is exploited at the encoder, whereas at the decoder the correlation between the previously decoded frames and the current frame is never exploited. In this paper we propose a method for motion estimation at the decoder. Based on the prediction residue and on the already decoded frames, the decoder is able to partially reconstruct the motion field, which can therefore be omitted from the encoded stream. The proposed approach is based on Least Squares Estimation (LSE) prediction, and is suitable for low bit-rate video coding, where the transmission of the motion field has a significant impact on the overall bit-rate. The same technique could also be useful for high-definition video coding, where a detailed and accurate motion field is required. Preliminary results are promising.

    Retrieval of video story units by Markov entropy rate

    In this paper we propose a method to retrieve video stories from a database. Given a sample story unit, i.e., a series of contiguous and semantically related shots, the most similar clips are retrieved and ranked. Similarity is evaluated on the story structures, and it depends on the number of expressed visual concepts and the pattern in which they appear inside the story. Hidden Markov models are used to represent story units, and Markov entropy rate is adopted as a compact index for evaluating structure similarity. The effectiveness of the proposed approach is demonstrated on a large video set from different kinds of programmes, and results are evaluated using a prototype system developed for story unit retrieval.
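To make the "Markov entropy rate" index more concrete, the sketch below computes the entropy rate of a plain Markov chain from its transition matrix, using the standard formula H = -sum_i pi_i sum_j P_ij log2 P_ij, where pi is the stationary distribution. It is a minimal sketch, not the paper's procedure: how the transition matrix is obtained from the hidden Markov model of a story unit is not stated in the abstract, and the function names and the power-iteration solver are assumptions. The chain is assumed to be ergodic so that the iteration converges.

```python
import math

def stationary_distribution(P, iterations=1000):
    """Approximate the stationary distribution of a row-stochastic matrix P
    by power iteration (assumes an ergodic chain)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iterations):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

def markov_entropy_rate(P):
    """Entropy rate in bits per step: -sum_i pi_i sum_j P_ij log2 P_ij."""
    pi = stationary_distribution(P)
    return -sum(
        pi[i] * p * math.log2(p)
        for i, row in enumerate(P)
        for p in row
        if p > 0  # 0 * log(0) is taken as 0
    )
```

Two story units could then be compared through the absolute difference of their entropy rates: a chain that always moves to a single next concept scores 0 bits, while one that moves uniformly among two concepts scores 1 bit.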

    Statistical Skimming of Feature Films

    We present a statistical framework based on Hidden Markov Models (HMMs) for skimming feature films. A chain of HMMs is used to model subsequent story units: HMM states represent different visual concepts, transitions model the temporal dependencies in each story unit, and stochastic observations are given by single shots. The skim is generated as an observation sequence in which, to favour more informative segments, shots are assigned a higher probability of observation if they have salient features related to specific film genres. The effectiveness of the method is demonstrated by skimming the first thirty minutes of a wide set of action and drama movies, in order to create previews that help users assess whether they would like to see a movie, without revealing its central part and plot details. Results are evaluated and compared through extensive user tests in terms of metrics that estimate the content representational value of the obtained video skims and their utility for assessing the user's interest in the observed movie.
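The generation of a skim "as an observation sequence" can be illustrated with a simplified sampler: walk through the states of one Markov model and, in each state, draw a shot with probability proportional to a saliency weight. This is a minimal sketch under strong assumptions, not the authors' framework: it uses a single model rather than a chain of HMMs, and the function name `generate_skim`, the data layout, and the saliency values are hypothetical. It also does not prevent a shot from being drawn twice.

```python
import random

def generate_skim(transitions, shots_by_state, saliency, length,
                  start_state=0, seed=0):
    """Sample `length` shots by walking a Markov chain over visual concepts.

    transitions    -- row-stochastic matrix between states (visual concepts)
    shots_by_state -- dict: state index -> list of shot identifiers
    saliency       -- dict: shot identifier -> non-negative weight
    Each state must contain at least one shot with positive saliency.
    """
    rng = random.Random(seed)
    state = start_state
    skim = []
    for _ in range(length):
        candidates = shots_by_state[state]
        weights = [saliency[s] for s in candidates]
        # More salient shots are more likely to be observed.
        skim.append(rng.choices(candidates, weights=weights, k=1)[0])
        # Move to the next visual concept according to the transition row.
        state = rng.choices(range(len(transitions)),
                            weights=transitions[state], k=1)[0]
    return skim
```

For example, with two states that alternate deterministically and one shot per state carrying all the saliency, the sampler returns those two shots in alternation.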

    A Statistical Framework for Video Skimming Based on Logical Story Units and Motion Activity

    In this work we present a method for video skimming based on Hidden Markov Models (HMMs) and motion activity. Specifically, a set of HMMs is used to model subsequent logical story units, where the HMM states represent different visual concepts, the transitions model the temporal dependencies in each story unit, and stochastic observations are given by single shots. The video skim is generated as an observation sequence in which, to favour more informative segments, dynamic shots are assigned a higher probability of observation. The effectiveness of the method is demonstrated on a video set from different kinds of programmes, and results are evaluated in terms of metrics that measure the content representational value of the obtained video skims.

    Interactive visualization of video content and associated description for semantic annotation

    In this paper, we present an intuitive graphic framework introduced for the effective visualization of video content and associated audio-visual description, with the aim of facilitating quick understanding and annotation of the semantic content of a video sequence. The basic idea consists of the visualization of a 2D feature space in which the shots of the considered video sequence are located. Moreover, the temporal position and the specific content of each shot can be displayed and analysed in more detail. The features are selected by the user, and can be updated during the navigation session. In the main window, shots of the considered video sequence are displayed in a Cartesian plane, and the proposed environment offers various functionalities for automatically and semi-automatically finding and annotating the shot clusters in this feature space. With this tool the user can therefore explore graphically how the basic segments of a video sequence are distributed in the feature space, and can recognize and annotate the significant clusters and their structure. The experimental results show that browsing and annotating documents with the aid of the proposed visualization paradigms is easy and quick, since the user has fast and intuitive access to the audio-video content, even if he or she has not seen the document yet.